Tag
8 articles
OpenAI has launched new voice intelligence features in its API, enhancing voice recognition accuracy and enabling more natural conversations. The updates have applications across customer service, education, and creator platforms.
Mistral AI's new TTS model, Voxtral, tackles the 'expressivity gap' in voice AI by combining autoregressive and flow-matching techniques for more emotionally expressive, multilingual speech synthesis.
This article explains the technical advancements behind xAI's new voice AI model, grok-voice-think-fast-1.0, and its performance improvements over existing systems.
ElevenLabs' VP warns sales candidates about long hours, 20x quotas, and high-pressure expectations. The company, valued at $11 billion, is emphasizing a performance-driven culture as it scales rapidly in the AI voice market.
Salesforce AI Research introduces VoiceAgentRAG, a dual-agent memory router that cuts voice RAG retrieval latency by a factor of 316, improving conversational AI responsiveness.
Mistral AI has released a new open-source speech model optimized for smartwatches and smartphones, bringing advanced voice capabilities to edge devices. The model's efficient design enables high-quality speech synthesis on resource-constrained hardware.
OpenAI has released API upgrades featuring a new audio model and faster agent connections to improve voice reliability and developer efficiency.
OpenAI's new WebSocket mode targets low-latency voice AI by enabling real-time, continuous data streaming, reducing delays in voice interactions.